Complexity of pattern classes and Lipschitz property
Rademacher and Gaussian complexities are successfully used in learning theory for measuring the capacity of the class of functions to be learned. One of the most important properties of these complexities is their Lipschitz property: composing a class of functions with a fixed Lipschitz function may increase its complexity by at most twice the Lipschitz constant. The proof of this property is non-trivial (in contrast to the other properties), and it is believed that the proof in the Gaussian case is conceptually more difficult than the one for the Rademacher case. In this paper we give a detailed proof of the Lipschitz property for the Rademacher case and generalize the same idea to an arbitrary complexity (including the Gaussian). We also discuss the related question of the Rademacher complexity of the class consisting of all Lipschitz functions with a given Lipschitz constant. We show that this complexity is surprisingly low in the one-dimensional case. The question for higher dimensions remains open.
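For orientation, the Lipschitz property described in this abstract can be written out as follows. This is one common textbook formulation, given here as an assumption: the abstract does not state the paper's own definitions, and normalization conventions (the factor 2/n, the absolute value, the condition at zero) vary between authors.

```latex
% Empirical Rademacher complexity of a class F on a sample x_1, ..., x_n,
% where sigma_1, ..., sigma_n are independent uniform {-1, +1} variables
% (absolute-value convention, assumed here):
\hat{R}_n(F) = \mathbb{E}_{\sigma}\left[ \sup_{f \in F}
    \left| \frac{2}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \right| \right]

% Lipschitz property: if \phi : \mathbb{R} \to \mathbb{R} is L-Lipschitz
% with \phi(0) = 0, then for the composed class
% \phi \circ F = \{ \phi \circ f : f \in F \},
\hat{R}_n(\phi \circ F) \le 2 L \, \hat{R}_n(F)
```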
Input Fast-Forwarding for Better Deep Learning
This paper introduces a new architectural framework, known as input
fast-forwarding, that can enhance the performance of deep networks. The main
idea is to incorporate a parallel path that sends representations of input
values forward to deeper network layers. This scheme is substantially different
from "deep supervision" in which the loss layer is re-introduced to earlier
layers. The parallel path provided by fast-forwarding enhances the training
process in two ways. First, it enables the individual layers to combine
higher-level information (from the standard processing path) with lower-level
information (from the fast-forward path). Second, this new architecture reduces
the problem of vanishing gradients substantially because the fast-forwarding
path provides a shorter route for gradient backpropagation. In order to
evaluate the utility of the proposed technique, a Fast-Forward Network (FFNet),
with 20 convolutional layers along with parallel fast-forward paths, has been
created and tested. The paper presents empirical results that demonstrate
improved learning capacity of FFNet due to fast-forwarding, as compared to
GoogLeNet (with deep supervision) and CaffeNet, which are 4x and 18x larger in
size, respectively. All of the source code and deep learning models described
in this paper will be made available to the entire research community.
Comment: Accepted at the 14th International Conference on Image Analysis and
Recognition (ICIAR) 2017, Montreal, Canada
Cross-Lingual Classification of Crisis Data
Many citizens nowadays flock to social media during crises to share or acquire the latest information about the event. Due to the sheer volume of data typically circulated during such events, it is necessary to be able to efficiently filter out irrelevant posts, thus focusing attention on the posts that are truly relevant to the crisis. Current methods for classifying the relevance of posts to a crisis or set of crises typically struggle to deal with posts in different languages, and it is not viable during rapidly evolving crisis situations to train new models for each language. In this paper we test statistical and semantic classification approaches on cross-lingual datasets from 30 crisis events, consisting of posts written mainly in English, Spanish, and Italian. We experiment with scenarios where the model is trained on one language and tested on another, and where the data is translated to a single language. We show that the addition of semantic features extracted from external knowledge bases improves accuracy over a purely statistical model.
Right for the Right Reason: Training Agnostic Networks
We consider the problem of a neural network being requested to classify
images (or other inputs) without making implicit use of a "protected concept",
that is a concept that should not play any role in the decision of the network.
Typically these concepts include information such as gender or race, or other
contextual information such as image backgrounds that might be implicitly
reflected in unknown correlations with other variables, making it insufficient
to simply remove them from the input features. In other words, making accurate
predictions is not good enough if those predictions rely on information that
should not be used: predictive performance is not the only important metric for
learning systems. We apply a method developed in the context of domain
adaptation to address this problem of "being right for the right reason", where
we request a classifier to make a decision in a way that is entirely 'agnostic'
to a given protected concept (e.g. gender, race, background etc.), even if this
could be implicitly reflected in other attributes via unknown correlations.
After defining the concept of an 'agnostic model', we demonstrate how the
Domain-Adversarial Neural Network can remove unwanted information from a model
using a gradient reversal layer.
Comment: Author's original version
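The gradient reversal layer mentioned in this abstract can be illustrated with a minimal NumPy sketch. It shows only the core idea (identity on the forward pass, negated and scaled gradient on the backward pass) and is not the authors' implementation; the class name `GradientReversal` and the weight `lam` are hypothetical names chosen for this example.

```python
import numpy as np


class GradientReversal:
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    def __init__(self, lam=1.0):
        # lam is a hypothetical trade-off weight for the reversed gradient.
        self.lam = lam

    def forward(self, x):
        # Features pass through unchanged to the protected-concept classifier.
        return x

    def backward(self, grad_output):
        # The gradient flowing back to the feature extractor is flipped, so the
        # extractor is pushed to make the protected concept harder to predict.
        return -self.lam * grad_output


grl = GradientReversal(lam=0.5)
features = np.array([1.0, -2.0, 3.0])
out = grl.forward(features)
grad_in = grl.backward(np.ones_like(features))
```

In an automatic-differentiation framework the same behaviour would normally be expressed as a custom gradient rule rather than a hand-written `backward` method.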
Classifying Crises-Information Relevancy with Semantics
Social media platforms have become key portals for sharing and consuming information during crisis situations. However, humanitarian organisations and affected communities often struggle to sift through the large volumes of data that are typically shared on such platforms during crises to determine which posts are truly relevant to the crisis, and which are not. Previous work on automatically classifying crisis information was mostly focused on using statistical features. However, such approaches tend to be inappropriate when processing data on a type of crisis that the model was not trained on, such as processing information about a train crash when the classifier was trained on floods, earthquakes, and typhoons. In such cases, the model will need to be retrained, which is costly and time-consuming. In this paper, we explore the impact of semantics in classifying Twitter posts across the same and different types of crises. We experiment with 26 crisis events, using a hybrid system that combines statistical features with various semantic features extracted from external knowledge bases. We show that adding semantic features has no noticeable benefit over statistical features when classifying same-type crises, whereas it enhances the classifier performance by up to 7.2% when classifying information about a new type of crisis.
Second-Generation Objects in the Universe: Radiative Cooling and Collapse of Halos with Virial Temperatures Above 10^4 Kelvin
The first generation of protogalaxies likely formed out of primordial gas via
H2-cooling in cosmological minihalos with virial temperatures of a few 1000K.
However, their abundance is likely to have been severely limited by feedback
processes which suppressed H2 formation. The formation of the protogalaxies
responsible for reionization and metal-enrichment of the intergalactic medium
then had to await the collapse of larger halos. Here we investigate the
radiative cooling and collapse of gas in halos with virial temperatures Tvir >
10^4K. In these halos, efficient atomic line radiation allows rapid cooling of
the gas to 8000 K; subsequently the gas can contract nearly isothermally at
this temperature. Without an additional coolant, the gas would likely settle
into a locally gravitationally stable disk; only disks with unusually low spin
would be unstable. However, we find that the initial atomic line cooling leaves
a large, out-of-equilibrium residual free electron fraction. This allows the
molecular fraction to build up to a universal value of about x(H2) = 10^-3,
almost independently of initial density and temperature. We show that this is a
non-equilibrium freezeout value that can be understood in terms of timescale
arguments. Furthermore, unlike in less massive halos, H2 formation is largely
impervious to feedback from external UV fields, due to the high initial
densities achieved by atomic cooling. The H2 molecules cool the gas further to
about 100K, and allow the gas to fragment on scales of a few 100 Msun. We
investigate the importance of various feedback effects such as
H2-photodissociation from internal UV fields and radiation pressure due to
Ly-alpha photon trapping, which are likely to regulate the efficiency of star
formation.
Comment: Revised version accepted by ApJ; some reorganization for clarity
Forecasting: Adopting the Methodology of Support Vector Machines to Nursing Research
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/71569/1/j.1741-6787.2006.00062.x.pd
Cross Pixel Optical Flow Similarity for Self-Supervised Learning
We propose a novel method for learning convolutional neural image
representations without manual supervision. We use motion cues in the form of
optical flow, to supervise representations of static images. The obvious
approach of training a network to predict flow from a single image can be
needlessly difficult due to intrinsic ambiguities in this prediction task. We
instead propose a much simpler learning goal: embed pixels such that the
similarity between their embeddings matches that between their optical flow
vectors. At test time, the learned deep network can be used without access to
video or flow information and transferred to tasks such as image
classification, detection, and segmentation. Our method, which significantly
simplifies previous attempts at using motion for self-supervision, achieves
state-of-the-art results in self-supervision using motion cues, competitive
results for self-supervision in general, and is overall state of the art in
self-supervised pretraining for semantic image segmentation, as demonstrated on
standard benchmarks.
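The learning goal in this abstract (pixel embeddings whose pairwise similarities match those of the optical flow vectors) can be sketched roughly as below. This is a simplified illustration under stated assumptions: it uses cosine similarity and a mean squared difference between the two similarity matrices, which is not necessarily the similarity measure or loss used in the paper. The function names are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(vectors, eps=1e-8):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, eps)  # guard against zero-length rows
    return unit @ unit.T


def similarity_matching_loss(embeddings, flow):
    """Mean squared gap between embedding similarities and flow similarities.

    embeddings: (num_pixels, embed_dim) array of per-pixel embeddings.
    flow:       (num_pixels, 2) array of per-pixel optical flow vectors.
    """
    s_emb = cosine_similarity_matrix(embeddings)
    s_flow = cosine_similarity_matrix(flow)
    return float(np.mean((s_emb - s_flow) ** 2))


# Three pixels: the first and third move in the same direction.
flow = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
good = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 0.0, 0.0]])
bad = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
loss_good = similarity_matching_loss(good, flow)
loss_bad = similarity_matching_loss(bad, flow)
```

Only the training signal needs flow; at test time the embedding network would be applied to single images, as the abstract notes.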
Automatic detection of limb prominences in 304 Å EUV images
A new algorithm for automatic detection of prominences on the solar limb in 304 Å EUV images is presented, and results of its application to SOHO/EIT data are discussed. The detection is based on the method of moments combined with a classifier analysis aimed at discriminating between limb prominences, active regions, and the quiet corona. This classifier analysis is based on a Support Vector Machine (SVM). Using a set of 12 moments of the radial intensity profiles, the algorithm performs well in discriminating between the above three categories of limb structures, with a misclassification rate of 7%. Pixels detected as belonging to a prominence are then used as a starting point to reconstruct the whole prominence by morphological image processing techniques. It is planned that a catalogue of limb prominences identified in SOHO and STEREO data using this method will be made publicly available to the scientific community.
Towards Emotion Recognition: A Persistent Entropy Application
Emotion recognition and classification is a very active area of research. In
this paper, we present a first approach to emotion classification using
persistent entropy and support vector machines. A topology-based model is
applied to obtain a single real number from each raw signal. These data are
used as input of a support vector machine to classify signals into 8 different
emotions (calm, happy, sad, angry, fearful, disgust and surprised).
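The "single real number" obtained from each signal is its persistent entropy. A minimal sketch of the usual definition follows: the Shannon entropy of the normalized bar lengths of a persistence barcode. It assumes the barcode has already been computed and contains only finite intervals with death greater than birth; how the paper builds the barcode from a raw signal is not described in the abstract, and the function name is hypothetical.

```python
import math


def persistent_entropy(intervals):
    """Shannon entropy of the normalized bar lengths of a persistence barcode.

    intervals: iterable of (birth, death) pairs with death > birth.
    """
    lengths = [death - birth for birth, death in intervals]
    if not lengths or any(length <= 0 for length in lengths):
        raise ValueError("need at least one interval, each with death > birth")
    total = sum(lengths)
    # p_i = l_i / L, entropy = -sum_i p_i * log(p_i)
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Two bars of equal length give the maximum entropy for two bars, log(2).
entropy_equal = persistent_entropy([(0.0, 1.0), (2.0, 3.0)])
# A single bar carries no uncertainty, so its entropy is zero.
entropy_single = persistent_entropy([(0.0, 5.0)])
```

Each signal's entropy value would then serve as the input feature to the support vector machine.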